Objective: To forecast a real time series using ETS and ARIMA models.
set.seed(32296622)
myseries <- aus_retail |>
filter(!(
`Series ID` %in% c(
"A3349561R",
"A3349883F",
"A3349499L",
"A3349902A",
"A3349588R",
"A3349763L",
"A3349372C",
"A3349450X",
"A3349679W",
"A3349378T",
"A3349767W",
"A3349451A"
)
)) |>
filter(`Series ID` == sample(`Series ID`, 1))
The data set used is a subset of the aus_retail data, which contains one numeric variable, Turnover, measured in $Million AUD. aus_retail is part of the tsibbledata package and can also be loaded via the fpp3 meta-package, which loads nine packages including tsibbledata.
The data is a time series of class tsibble, and its source is the Australian Bureau of Statistics, catalogue number 8501.0, table 11 (Robjhyndman, n.d.).
The sample data set contains the Turnover data for 'hardware, building and garden supplies retailing' in the state of 'South Australia', recorded monthly from April 1982 to December 2018, and comprises 441 observations of 5 variables.
The dataset contains the following variables; table 1.1 also displays each variable's type.
data_dict <- data.frame(
  S.No. = c("1", "2", "3", "4", "5"),
  Variables = c("State", "Industry", "Series ID", "Month", "Turnover"),
  DataType = c("Character", "Character", "Character", "Month", "Numeric")
)
# kable_styling(), kable_paper() and scroll_box() come from the kableExtra package
knitr::kable(data_dict, caption = "Data Dictionary") |>
  kable_styling(latex_options = c("striped", "hold_position")) |>
  kable_paper("hover", full_width = T) |>
  scroll_box(width = "100%", height = "300px")
| S.No. | Variables | DataType |
|---|---|---|
| 1 | State | Character |
| 2 | Industry | Character |
| 3 | Series ID | Character |
| 4 | Month | Month |
| 5 | Turnover | Numeric |
myseries
## # A tsibble: 441 x 5 [1M]
## # Key: State, Industry [1]
## State Industry `Series ID` Month Turnover
## <chr> <chr> <chr> <mth> <dbl>
## 1 South Australia Hardware, building and garden … A3349501L 1982 Apr 8.6
## 2 South Australia Hardware, building and garden … A3349501L 1982 May 9.5
## 3 South Australia Hardware, building and garden … A3349501L 1982 Jun 8.4
## 4 South Australia Hardware, building and garden … A3349501L 1982 Jul 10.3
## 5 South Australia Hardware, building and garden … A3349501L 1982 Aug 10.6
## 6 South Australia Hardware, building and garden … A3349501L 1982 Sep 11.5
## 7 South Australia Hardware, building and garden … A3349501L 1982 Oct 10.8
## 8 South Australia Hardware, building and garden … A3349501L 1982 Nov 13.1
## 9 South Australia Hardware, building and garden … A3349501L 1982 Dec 25.4
## 10 South Australia Hardware, building and garden … A3349501L 1983 Jan 9.2
## # ℹ 431 more rows
Table 1.2 displays the yearly average turnover in $Million AUD.
year_average <- myseries |>
mutate(year = year(Month)) |>
index_by(year) |>
summarise(Total_turnover = mean(Turnover))
knitr::kable(year_average, caption = "Yearly Turnover Summary") |>
kable_styling(latex_options = c("striped", "hold_position")) |>
kable_paper("hover", full_width = T) |>
scroll_box(width = "100%", height = "300px")
| year | Total_turnover |
|---|---|
| 1982 | 12.02222 |
| 1983 | 11.17500 |
| 1984 | 13.39167 |
| 1985 | 14.46667 |
| 1986 | 12.91667 |
| 1987 | 14.03333 |
| 1988 | 16.21667 |
| 1989 | 20.43333 |
| 1990 | 23.45000 |
| 1991 | 23.94167 |
| 1992 | 28.28333 |
| 1993 | 27.84167 |
| 1994 | 28.45833 |
| 1995 | 27.71667 |
| 1996 | 31.39167 |
| 1997 | 34.41667 |
| 1998 | 39.23333 |
| 1999 | 42.93333 |
| 2000 | 55.35833 |
| 2001 | 70.86667 |
| 2002 | 74.66667 |
| 2003 | 70.90833 |
| 2004 | 69.04167 |
| 2005 | 62.55833 |
| 2006 | 67.86667 |
| 2007 | 75.57500 |
| 2008 | 96.25000 |
| 2009 | 88.49167 |
| 2010 | 85.64167 |
| 2011 | 85.88333 |
| 2012 | 70.88333 |
| 2013 | 66.82500 |
| 2014 | 66.57500 |
| 2015 | 79.94167 |
| 2016 | 82.10833 |
| 2017 | 92.55833 |
| 2018 | 98.35000 |
myseries |>
features(Turnover, list(mean = mean)) |>
arrange(mean)
## # A tibble: 1 × 3
## State Industry mean
## <chr> <chr> <dbl>
## 1 South Australia Hardware, building and garden supplies retailing 51.1
Here the data has been divided into 4 equal sections (quartiles), each containing 25% of the observations.
myseries |>
features(Turnover, quantile)
## # A tibble: 1 × 7
## State Industry `0%` `25%` `50%` `75%` `100%`
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 South Australia Hardware, building and garden … 8.4 24.6 53.5 76.7 124.
Figure 1.1 shows that both ends of this Q-Q plot deviate from the reference line, indicating fat tails, and even its centre does not follow a straight line. This means the data contains many extreme values and varies throughout the distribution.
ggplot(myseries, aes(sample = Turnover)) +
geom_qq() +
geom_qq_line(color = "#6C03A0") +
ylab("Sample Turnover Values ($Million AUD)") +
xlab("Theoretical Values") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 1.1: Quantile-Quantile Plot
Figure 1.2 is a histogram that looks somewhat bimodal, i.e. it shows 2 peaks. Both peaks appear to have similar densities, with the density curve rising and falling between them.
ggplot(myseries, aes(x = Turnover)) +
geom_histogram(aes(y = after_stat(density)), fill = "#6EBA6B") +
geom_density(color = "#6C03A0") +
ylab("Density") +
xlab("Turnover ($Million AUD)") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 1.2: Distribution of Turnover
Figure 1.3 shows that:
Trend: There is a somewhat increasing trend visible in the data set.
Seasonality: There seems to be multiplicative seasonality, as the height of the seasonal swings changes with the level of the series. Strong peaks are observed at the end of each year.
myseries |>
autoplot(Turnover, color = "#C63F2C") +
xlab("Year/Month") +
ylab("Turnover ($Million AUD)") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 1.3: Time Series
According to figure 1.4, the highest turnovers are observed in December each year, and February has values towards the lower end. This is plausibly because hardware, building and garden work are outdoor tasks preferred during summer, which in Australia peaks around December and winds down by February.
myseries |>
gg_season(Turnover,
labels = "both") +
ylab("Turnover ($Million AUD)") +
scale_colour_viridis_c() +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 1.4: Seasonal Plot
In figure 1.5 the blue line shows the mean value for each month across all years. This plot confirms our observation from figure 1.4 that the highest turnover occurs in December and the lowest in February.
myseries |>
gg_subseries(Turnover, color = "#156A06") +
ylab("Turnover ($Million AUD)") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 1.5: Subseries Plot
Below it is seen that the seasonal strength feature takes a high value (0.872), confirming strong seasonality.
myseries |>
features(Turnover, feat_stl)
## # A tibble: 1 × 11
## State Industry trend_strength seasonal_strength_year seasonal_peak_year
## <chr> <chr> <dbl> <dbl> <dbl>
## 1 South Austr… Hardwar… 0.991 0.872 9
## # ℹ 6 more variables: seasonal_trough_year <dbl>, spikiness <dbl>,
## # linearity <dbl>, curvature <dbl>, stl_e_acf1 <dbl>, stl_e_acf10 <dbl>
Here in figure 1.6, the colours indicate the different months. Most of the lag plots show positive correlation, confirming strong seasonality in the data.
myseries |>
gg_lag(Turnover,
lags = 1:24,
geom = 'point',
size = 0.5) +
facet_wrap( ~ .lag, ncol = 6) +
ylab("Turnover ($Million AUD)") +
xlab("Lag") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 1.6: Lag plot
Autocorrelation, i.e. the linear relationship between lagged values of the time series, is shown in figure 1.7. It shows that:
There are peaks at every 12th lag, i.e. at the same month of each year, indicating seasonality. An equivalent dip (from the previous lag) is observed at equal intervals as well.
The "scalloped" shape observed is also due to seasonality.
The data has a trend, as the autocorrelations for small lags are large and positive (observations close in time are also close in value), and these positive values slowly decrease as the lags increase.
myseries |>
ACF(Turnover, lag_max = 48) |>
autoplot() +
ylab("ACF") +
xlab("Lag [1 Month]") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 1.7: ACF
Below is a summary of seven autocorrelation features: acf1, the first autocorrelation of the original data; acf10, the sum of squares of its first ten autocorrelation coefficients; diff1_acf1 and diff1_acf10, the same two quantities for the differenced data; diff2_acf1 and diff2_acf10, the same for the twice-differenced data; and season_acf1, the autocorrelation coefficient at the first seasonal lag.
myseries |>
features(Turnover, feat_acf)
## # A tibble: 1 × 9
## State Industry acf1 acf10 diff1_acf1 diff1_acf10 diff2_acf1 diff2_acf10
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 South Aust… Hardwar… 0.957 7.84 -0.143 0.0904 -0.501 0.267
## # ℹ 1 more variable: season_acf1 <dbl>
Transformation and differencing are performed to make the data stationary. There are 3 steps in this process:
Check for changing variance in the data; if visible, transform the data with an appropriate lambda value. Evaluate whether the data is stationary and, if not, move to the next step.
Apply a seasonal difference to remove seasonality, if the data is seasonal (this sometimes removes the trend as well). Evaluate whether the data is stationary and, if not, move to the next step.
Apply a regular difference to remove any remaining trend or other leftover structure.
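As a sketch, these three steps can be chained in a single fpp3 pipeline (assuming the fpp3 packages are loaded and `lambda` holds the Guerrero estimate computed later in this section):

```r
# Sketch of the stationarity workflow used below: stabilise the variance,
# take one seasonal difference, then check the result with unit-root features.
myseries |>
  mutate(bc = box_cox(Turnover, lambda)) |>    # step 1: variance-stabilising transform
  mutate(sdiff = difference(bc, lag = 12)) |>  # step 2: seasonal difference (monthly data)
  features(sdiff, list(unitroot_kpss, unitroot_ndiffs))  # step 3: stationarity checks
```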
Figure 2.1 shows the time series after transforming Turnover to log(Turnover); there still seems to be some trend and seasonality in the data, so this transformation alone is not sufficient.
myseries |>
autoplot(log(Turnover)) +
ylab("Log Turnover") +
xlab("Year Month") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 2.1: Log Transformation
Therefore, the Guerrero method is applied next to find the lambda value for the Box-Cox transformation.
lambda <- myseries |>
features(Turnover, features = guerrero)
lambda <- pull(lambda)
lambda
## [1] 0.4662629
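For reference, with lambda not equal to 0 and positive data, the Box-Cox transformation computed by `box_cox()` is w = (y^lambda - 1) / lambda; a minimal base-R version (a hypothetical helper, for illustration only) is:

```r
# Minimal Box-Cox sketch for positive y and non-zero lambda.
# fabletools::box_cox() additionally handles lambda = 0 (natural log).
bc <- function(y, lambda) (y^lambda - 1) / lambda

bc(1, 0.4662629)    # always 0 at y = 1
bc(8.6, 0.4662629)  # transform of the first Turnover observation
```

With lambda close to 0.5, the transformation behaves roughly like a scaled square root, compressing the larger turnover values.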
In figure 2.2 a Box-Cox transformation with lambda = 0.4662629 is applied to Turnover; the series still seems to have a trend and a seasonal pattern, so the data is not stationary.
myseries |>
autoplot(box_cox(Turnover, lambda)) +
ylab("Transformed Turnover (lambda=0.467)") +
xlab("Year Month") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 2.2: Box-Cox Transformation
Figure 2.3 shows the time plot and the ACF together and adds a new plot to the report, the PACF (which will be used in section 3).
myseries |>
gg_tsdisplay(box_cox(
Turnover,
lambda),plot_type = "partial")
Figure 2.3: Timeplot, ACF, PACF with Transform
There are still signs of trend and seasonality visible in the ACF, i.e. the data is not stationary and therefore requires a seasonal difference.
This conclusion from figure 2.3 is confirmed using the KPSS unit-root test, whose null hypothesis is H0: the data is stationary.
myseries |>
features(box_cox(
Turnover,
lambda), unitroot_kpss)
## # A tibble: 1 × 4
## State Industry kpss_stat kpss_pvalue
## <chr> <chr> <dbl> <dbl>
## 1 South Australia Hardware, building and garden supplies … 6.91 0.01
Here the reported p-value is 0.01, which actually means p < 0.01 (the printed value is floored at 0.01). Since p < 0.05, H0 is rejected and the data needs differencing.
Checking whether a seasonal difference is needed.
myseries |>
features(box_cox(
Turnover,
lambda), unitroot_nsdiffs)
## # A tibble: 1 × 3
## State Industry nsdiffs
## <chr> <chr> <int>
## 1 South Australia Hardware, building and garden supplies retailing 1
It is shown that 1 seasonal difference is needed.
The ACF in figure 2.3 shows seasonality in this monthly data, and hence lag 12 is used for the first seasonal difference.
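The effect of a lag-12 difference can be illustrated on a synthetic monthly series with seasonality and a linear trend (illustration only; the series below is made up):

```r
# A lag-12 difference computes y[t] - y[t - 12]: the seasonal pattern
# (period 12) cancels exactly, and a linear trend becomes a constant.
y <- sin(2 * pi * (1:36) / 12) + 0.1 * (1:36)
sd12 <- diff(y, lag = 12)
all(abs(sd12 - 1.2) < 1e-9)  # -> TRUE: only the yearly trend increment 12 * 0.1 remains
```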
myseries |>
gg_tsdisplay(difference(box_cox(
Turnover, lambda),
lag=12), plot_type = "partial")
Figure 2.4: After Transforming and Seasonal Differencing
From figure 2.4 the series can be said to be stationary, as it is observed that:
In the ACF, the autocorrelations fall to 0 quickly after the 11th lag.
The time plot shows values that do not vary systematically with time.
There seems to be no seasonality or trend left in the data.
This is confirmed by unit root KPSS tests below:
myseries |>
features(difference(box_cox(
Turnover, lambda), lag=12),
unitroot_kpss)
## # A tibble: 1 × 4
## State Industry kpss_stat kpss_pvalue
## <chr> <chr> <dbl> <dbl>
## 1 South Australia Hardware, building and garden supplies … 0.155 0.1
Here the p-value is 0.1 (i.e. at least 0.1, as the printed value is capped), so the series can be considered stationary.
Further checking whether more differencing is required:
myseries |>
features(difference(box_cox(
Turnover, lambda), lag=12),
unitroot_ndiffs)
## # A tibble: 1 × 3
## State Industry ndiffs
## <chr> <chr> <int>
## 1 South Australia Hardware, building and garden supplies retailing 0
This shows that no further differencing is needed, i.e. step 3 (a regular first difference) is not required. Hence, the data is now stationary.
For this, a test and a train data set are prepared, with the test set covering the last 24 months, i.e. 2017 and 2018, and the rest forming the training set.
test <- myseries |>
filter(Month >= yearmonth("2017 Jan"))
train <- myseries |>
filter(Month < yearmonth("2017 Jan"))
Observing figure 2.2, the error and seasonality components could be either additive or multiplicative. There is a somewhat upward trend, which suggests an additive trend component. The candidate models are therefore ETS(M,A,M), ETS(M,N,M) and ETS(M,Ad,M). With an additive seasonal component, ETS(M,N,A) and ETS(A,Ad,A) are also considered.
ets_fit <- train |>
  model(
    MAM = ETS(box_cox(Turnover, lambda) ~ error("M") + trend("A") + season("M")),
    MNM = ETS(box_cox(Turnover, lambda) ~ error("M") + trend("N") + season("M")),
    MAdM = ETS(box_cox(Turnover, lambda) ~ error("M") + trend("Ad") + season("M")),
    MNA = ETS(box_cox(Turnover, lambda) ~ error("M") + trend("N") + season("A")),
    AAdA = ETS(box_cox(Turnover, lambda) ~ error("A") + trend("Ad") + season("A")),
    ets_auto = ETS(box_cox(Turnover, lambda))
  )
glance(ets_fit)
## # A tibble: 5 × 11
## State Industry .model sigma2 log_lik AIC AICc BIC MSE AMSE MAE
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 South Au… Hardwar… MAM 0.00241 -933. 1900. 1901. 1968. 0.215 0.289 0.0363
## 2 South Au… Hardwar… MNM 0.00240 -930. 1890. 1892. 1951. 0.209 0.287 0.0365
## 3 South Au… Hardwar… MAdM 0.182 -894. 1825. 1827. 1897. 0.175 0.254 0.323
## 4 South Au… Hardwar… MNA 0.00214 -906. 1842. 1843. 1902. 0.176 0.251 0.0344
## 5 South Au… Hardwar… ets_a… 0.180 -893. 1816. 1818. 1877. 0.174 0.253 0.322
Here the results are discussed for the shortlisted models and the automatically chosen model. It is observed that the AICc value is the lowest for the automatic ETS model.
ets_fit |>
select(ets_auto) |>
report()
## Series: Turnover
## Model: ETS(A,N,A)
## Transformation: box_cox(Turnover, lambda)
## Smoothing parameters:
## alpha = 0.6642112
## gamma = 0.1496237
##
## Initial states:
## l[0] s[0] s[-1] s[-2] s[-3] s[-4] s[-5]
## 4.294015 -0.3972145 -0.5081342 -0.3191967 2.53842 0.3365877 0.1001217
## s[-6] s[-7] s[-8] s[-9] s[-10] s[-11]
## 0.07167334 -0.04089306 -0.3280601 -0.6257368 -0.2663899 -0.5611781
##
## sigma^2: 0.18
##
## AIC AICc BIC
## 1816.493 1817.690 1876.989
The best model reported here is ETS(A,N,A).
ets_fit |>
forecast(h = 24) |>
accuracy(myseries)
## # A tibble: 5 × 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 MAM South… Hardwar… Test 8.74 10.1 8.74 8.93 8.93 1.45 1.18 0.348
## 2 MAdM South… Hardwar… Test 12.8 14.5 12.8 12.8 12.8 2.12 1.70 0.601
## 3 MNA South… Hardwar… Test 12.3 14.0 12.3 12.4 12.4 2.05 1.64 0.580
## 4 MNM South… Hardwar… Test 12.9 14.2 12.9 13.1 13.1 2.15 1.67 0.551
## 5 ets_auto South… Hardwar… Test 12.8 14.5 12.8 12.8 12.8 2.12 1.70 0.600
We observe that the MAdM and automatic ANA models give the same test results except for ACF1, where the ANA value is slightly lower; hence ETS(A,N,A) is selected as the best ETS model.
Referring to figure 2.3, we determine candidate (p,d,q) and (P,D,Q)[12] orders. We already know that d = 0 and D = 1 (one seasonal difference).
For q we look at the ACF plot: the only significant lag before 12 is the 1st (possibly also the 2nd), so q = 1 or q = 2. In the PACF, the significant lag at 13 suggests p = 1.
Hence (p,d,q) candidates include (1,0,1), (2,0,1), (0,0,1), (2,0,0), (1,0,2), (2,0,2) and (0,0,2); p = 0 arises when only the ACF is considered and q = 0 when only the PACF is considered.
For the seasonal orders we look only at the seasonal lags, i.e. multiples of 12. In the ACF, the quick decay suggests Q = 0, though the first seasonal lag may be significant, giving Q = 1. In the PACF, only lags 12 and 24 appear significant, so P = 2; since even the first seasonal lag is not strongly significant, P = 0 is also possible.
Hence (P,D,Q) candidates are (2,1,0) and (0,1,1).
Looking at the time plot, the series is not centred on zero, so a constant may be present.
arima_fit <- train |>
  model(
    arima101210 = ARIMA(box_cox(Turnover, lambda) ~ pdq(1, 0, 1) + PDQ(2, 1, 0)),
    arima201210 = ARIMA(box_cox(Turnover, lambda) ~ pdq(2, 0, 1) + PDQ(2, 1, 0)),
    arima001210 = ARIMA(box_cox(Turnover, lambda) ~ pdq(0, 0, 1) + PDQ(2, 1, 0)),
    arima101011 = ARIMA(box_cox(Turnover, lambda) ~ pdq(1, 0, 1) + PDQ(0, 1, 1)),
    arima102111 = ARIMA(box_cox(Turnover, lambda) ~ pdq(1, 0, 2) + PDQ(1, 1, 1)),
    arima_auto = ARIMA(box_cox(Turnover, lambda), trace = TRUE)
  )
## Model specification Selection metric
## ARIMA(2,0,2)(1,1,1)[12]+c 2516.509675
## ARIMA(0,0,0)(0,1,0)[12]+c 3102.480602
## ARIMA(1,0,0)(1,1,0)[12]+c 2590.688591
## ARIMA(0,0,1)(0,1,1)[12]+c 2853.611292
## ARIMA(2,0,2)(0,1,1)[12]+c 2506.156791
## ARIMA(2,0,2)(0,1,0)[12]+c 2673.939059
## ARIMA(1,0,2)(0,1,1)[12]+c 2503.103318
## ARIMA(1,0,2)(0,1,0)[12]+c 2671.002491
## ARIMA(0,0,2)(0,1,1)[12]+c 2747.122528
## ARIMA(1,0,1)(0,1,1)[12]+c 2507.648728
## ARIMA(1,0,2)(0,1,1)[12] 2503.699989
## ARIMA(1,0,3)(0,1,1)[12]+c 2505.174378
## ARIMA(1,0,2)(0,1,2)[12]+c 2505.170039
## ARIMA(1,0,2)(1,1,1)[12]+c 2510.213316
## ARIMA(0,0,2)(0,1,0)[12]+c 2783.907628
## ARIMA(1,0,1)(0,1,0)[12]+c 2671.501861
## ARIMA(1,0,2)(0,1,0)[12] 2671.300571
## ARIMA(1,0,3)(0,1,0)[12]+c 2672.738475
## ARIMA(0,0,2)(0,1,1)[12] 2784.501687
## ARIMA(0,0,3)(0,1,1)[12]+c 2688.354160
## ARIMA(1,0,1)(0,1,1)[12] 2509.200207
## ARIMA(1,0,3)(0,1,1)[12] 2505.750114
## ARIMA(2,0,1)(0,1,1)[12]+c 2505.334854
## ARIMA(2,0,2)(0,1,1)[12] 2506.660164
## ARIMA(2,0,3)(0,1,1)[12]+c 2508.231499
## ARIMA(0,0,2)(0,1,2)[12]+c 2746.043150
## ARIMA(1,0,1)(0,1,2)[12]+c 2509.659590
## ARIMA(1,0,2)(0,1,2)[12] 2505.758751
## ARIMA(1,0,3)(0,1,2)[12]+c 2507.251548
## ARIMA(2,0,2)(0,1,2)[12]+c 2508.235563
## ARIMA(1,0,2)(1,1,0)[12]+c 2576.589825
## ARIMA(0,0,2)(1,1,1)[12]+c 2750.877321
## ARIMA(1,0,1)(1,1,1)[12]+c 2513.673285
## ARIMA(1,0,2)(1,1,1)[12] 2510.648528
## ARIMA(1,0,3)(1,1,1)[12]+c 2511.680653
## ARIMA(1,0,2)(1,1,2)[12]+c 2505.742034
##
## --- Re-estimating best models without approximation ---
##
## ARIMA(1,0,2)(0,1,1)[12]+c 472.388817
tidy(arima_fit)
## # A tibble: 29 × 8
## State Industry .model term estimate std.error statistic p.value
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 South Australia Hardware… arima… ar1 0.945 0.0183 51.7 1.46e-180
## 2 South Australia Hardware… arima… ma1 -0.258 0.0612 -4.22 3.05e- 5
## 3 South Australia Hardware… arima… sar1 -0.588 0.0488 -12.0 9.63e- 29
## 4 South Australia Hardware… arima… sar2 -0.270 0.0495 -5.47 8.03e- 8
## 5 South Australia Hardware… arima… cons… 0.0308 0.0163 1.89 5.96e- 2
## 6 South Australia Hardware… arima… ar1 1.27 0.129 9.80 1.73e- 20
## 7 South Australia Hardware… arima… ar2 -0.290 0.123 -2.36 1.87e- 2
## 8 South Australia Hardware… arima… ma1 -0.557 0.111 -5.01 8.08e- 7
## 9 South Australia Hardware… arima… sar1 -0.595 0.0487 -12.2 2.14e- 29
## 10 South Australia Hardware… arima… sar2 -0.265 0.0495 -5.35 1.47e- 7
## # ℹ 19 more rows
Here the estimate column shows that the models include constant terms.
glance(arima_fit)
## # A tibble: 6 × 10
## State Industry .model sigma2 log_lik AIC AICc BIC ar_roots ma_roots
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <list> <list>
## 1 South Aust… Hardwar… arima… 0.208 -257. 526. 526. 550. <cpl> <cpl>
## 2 South Aust… Hardwar… arima… 0.207 -256. 524. 525. 548. <cpl> <cpl>
## 3 South Aust… Hardwar… arima… 0.444 -409. 827. 827. 847. <cpl> <cpl>
## 4 South Aust… Hardwar… arima… 0.182 -233. 477. 477. 497. <cpl> <cpl>
## 5 South Aust… Hardwar… arima… 0.180 -230. 474. 474. 502. <cpl> <cpl>
## 6 South Aust… Hardwar… arima… 0.179 -230. 472. 472. 496. <cpl> <cpl>
This shows that the lowest AICc value belongs to the automatic ARIMA model, which is only slightly lower than that of the (1,0,2)(1,1,1) model.
arima_fit |>
select(arima_auto) |>
report()
## Series: Turnover
## Model: ARIMA(1,0,2)(0,1,1)[12] w/ drift
## Transformation: box_cox(Turnover, lambda)
##
## Coefficients:
## ar1 ma1 ma2 sma1 constant
## 0.9742 -0.2652 -0.1226 -0.7777 0.0078
## s.e. 0.0124 0.0511 0.0480 0.0331 0.0029
##
## sigma^2 estimated as 0.1794: log likelihood=-230.09
## AIC=472.18 AICc=472.39 BIC=496.2
The automatic ARIMA model reported is (1,0,2)(0,1,1)[12] with drift, which is the best ARIMA model.
arima_fit |>
forecast(h = 24) |>
accuracy(myseries)
## # A tibble: 6 × 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 arima001… Sout… Hardwar… Test 8.03 9.74 8.03 7.96 7.96 1.34 1.14 0.508
## 2 arima101… Sout… Hardwar… Test 9.46 10.7 9.46 9.63 9.63 1.57 1.26 0.372
## 3 arima101… Sout… Hardwar… Test 8.71 10.2 8.71 8.75 8.75 1.45 1.19 0.413
## 4 arima102… Sout… Hardwar… Test 9.88 11.1 9.88 10.0 10.0 1.64 1.31 0.403
## 5 arima201… Sout… Hardwar… Test 10.4 11.8 10.4 10.5 10.5 1.73 1.39 0.496
## 6 arima_au… Sout… Hardwar… Test 9.93 11.2 9.93 10.1 10.1 1.65 1.31 0.406
The best ETS model chosen is ETS(A,N,A) and the best ARIMA model is ARIMA(1,0,2)(0,1,1)[12].
ets_fit |>
select(ets_auto) |>
tidy()
## # A tibble: 15 × 3
## .model term estimate
## <chr> <chr> <dbl>
## 1 ets_auto alpha 0.664
## 2 ets_auto gamma 0.150
## 3 ets_auto l[0] 4.29
## 4 ets_auto s[0] -0.397
## 5 ets_auto s[-1] -0.508
## 6 ets_auto s[-2] -0.319
## 7 ets_auto s[-3] 2.54
## 8 ets_auto s[-4] 0.337
## 9 ets_auto s[-5] 0.100
## 10 ets_auto s[-6] 0.0717
## 11 ets_auto s[-7] -0.0409
## 12 ets_auto s[-8] -0.328
## 13 ets_auto s[-9] -0.626
## 14 ets_auto s[-10] -0.266
## 15 ets_auto s[-11] -0.561
arima_fit |>
select(arima_auto) |>
tidy()
## # A tibble: 5 × 6
## .model term estimate std.error statistic p.value
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 arima_auto ar1 0.974 0.0124 78.8 1.13e-247
## 2 arima_auto ma1 -0.265 0.0511 -5.19 3.37e- 7
## 3 arima_auto ma2 -0.123 0.0480 -2.55 1.10e- 2
## 4 arima_auto sma1 -0.778 0.0331 -23.5 1.21e- 77
## 5 arima_auto constant 0.00779 0.00289 2.69 7.33e- 3
Figure 4.1 shows the residuals for the best ETS model. They appear fairly normally distributed. The ACF suggests something close to white noise, i.e. some residual autocorrelations remain but most are not significant.
gg_tsresiduals(ets_fit |>
select(ets_auto))
Figure 4.1: Residuals ETS
ets_fit |>
select(State, Industry, ets_auto) |>
forecast(h = 24) |>
autoplot(myseries) +
ylab("Turnover ($Million AUD)") +
xlab("Year Month") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 4.2: Forecast for ETS best model ANA
ets_fit |>
select(ets_auto) |>
augment()
## # A tsibble: 417 x 6 [1M]
## # Key: .model [1]
## .model Month Turnover .fitted .resid .innov
## <chr> <mth> <dbl> <dbl> <dbl> <dbl>
## 1 ets_auto 1982 Apr 8.6 8.69 -0.0899 -0.0284
## 2 ets_auto 1982 May 9.5 9.59 -0.0883 -0.0265
## 3 ets_auto 1982 Jun 8.4 8.37 0.0274 0.00879
## 4 ets_auto 1982 Jul 10.3 9.34 0.955 0.282
## 5 ets_auto 1982 Aug 10.6 11.0 -0.379 -0.107
## 6 ets_auto 1982 Sep 11.5 11.1 0.370 0.101
## 7 ets_auto 1982 Oct 10.8 11.5 -0.679 -0.188
## 8 ets_auto 1982 Nov 13.1 11.9 1.21 0.313
## 9 ets_auto 1982 Dec 25.4 22.8 2.61 0.478
## 10 ets_auto 1983 Jan 9.2 11.4 -2.21 -0.638
## # ℹ 407 more rows
ets_fit |>
select(ets_auto) |>
forecast(h = 24) |>
hilo(level = 95)
## # A tsibble: 24 x 5 [1M]
## # Key: .model [1]
## .model Month Turnover .mean `95%`
## <chr> <mth> <dist> <dbl> <hilo>
## 1 ets_auto 2017 Jan t(N(15, 0.18)) 82.5 [73.87755, 91.39686]95
## 2 ets_auto 2017 Feb t(N(14, 0.26)) 71.4 [61.94528, 81.42036]95
## 3 ets_auto 2017 Mar t(N(14, 0.34)) 79.6 [68.15389, 91.73461]95
## 4 ets_auto 2017 Apr t(N(14, 0.42)) 78.2 [65.69555, 91.65149]95
## 5 ets_auto 2017 May t(N(14, 0.5)) 76.3 [62.82308, 90.74869]95
## 6 ets_auto 2017 Jun t(N(14, 0.58)) 76.1 [61.65714, 91.68188]95
## 7 ets_auto 2017 Jul t(N(14, 0.66)) 76.5 [61.14928, 93.27143]95
## 8 ets_auto 2017 Aug t(N(14, 0.74)) 80.1 [63.44050, 98.28388]95
## 9 ets_auto 2017 Sep t(N(15, 0.82)) 85.3 [67.18874, 105.11464]95
## 10 ets_auto 2017 Oct t(N(16, 0.89)) 93.0 [73.15791, 114.76831]95
## # ℹ 14 more rows
Zooming in on the ACF panel of figure 4.1, figure 4.3 suggests some remaining autocorrelation.
augment(ets_fit) |>
filter(.model == "ets_auto") |>
ACF(.innov) |>
autoplot() +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 4.3: ACF Diagnostics for ETS best model
The Ljung-Box test below uses the null hypothesis H0 that the residuals are independently distributed.
augment(ets_fit) |>
filter(.model == "ets_auto") |>
features(.innov, ljung_box, lag = 24, dof = 18)
## # A tibble: 1 × 5
## State Industry .model lb_stat lb_pvalue
## <chr> <chr> <chr> <dbl> <dbl>
## 1 South Australia Hardware, building and garden suppli… ets_a… 28.5 0.0000764
Here the p-value is < 0.05, so the null hypothesis is rejected, suggesting that the residuals are autocorrelated.
Figure 4.4 shows an approximately normal residual distribution and near-white-noise behaviour.
gg_tsresiduals(arima_fit |>
select(arima_auto))
Figure 4.4: Residuals ARIMA
arima_fit |>
select(State, Industry, arima_auto) |>
forecast(h = 24) |>
autoplot(myseries) +
ylab("Turnover ($Million AUD)") +
xlab("Year Month") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 4.5: Forecast for best ARIMA model (102)(011)
arima_fit |>
select(arima_auto) |>
augment()
## # A tsibble: 417 x 6 [1M]
## # Key: .model [1]
## .model Month Turnover .fitted .resid .innov
## <chr> <mth> <dbl> <dbl> <dbl> <dbl>
## 1 arima_auto 1982 Apr 8.6 8.59 0.0107 0.00340
## 2 arima_auto 1982 May 9.5 9.49 0.0122 0.00368
## 3 arima_auto 1982 Jun 8.4 8.39 0.0104 0.00334
## 4 arima_auto 1982 Jul 10.3 10.3 0.0136 0.00392
## 5 arima_auto 1982 Aug 10.6 10.6 0.0141 0.00400
## 6 arima_auto 1982 Sep 11.5 11.5 0.0157 0.00425
## 7 arima_auto 1982 Oct 10.8 10.8 0.0144 0.00406
## 8 arima_auto 1982 Nov 13.1 13.1 0.0184 0.00467
## 9 arima_auto 1982 Dec 25.4 25.4 0.0407 0.00725
## 10 arima_auto 1983 Jan 9.2 9.19 0.0117 0.00359
## # ℹ 407 more rows
arima_fit |>
select(arima_auto) |>
forecast(h = 24) |>
hilo(level = 95)
## # A tsibble: 24 x 5 [1M]
## # Key: .model [1]
## .model Month Turnover .mean `95%`
## <chr> <mth> <dist> <dbl> <hilo>
## 1 arima_auto 2017 Jan t(N(15, 0.18)) 81.4 [72.85991, 90.22765]95
## 2 arima_auto 2017 Feb t(N(14, 0.27)) 71.7 [62.03240, 81.92949]95
## 3 arima_auto 2017 Mar t(N(14, 0.33)) 80.2 [68.87268, 92.14720]95
## 4 arima_auto 2017 Apr t(N(14, 0.38)) 78.7 [66.61905, 91.51518]95
## 5 arima_auto 2017 May t(N(14, 0.43)) 76.0 [63.45951, 89.51768]95
## 6 arima_auto 2017 Jun t(N(14, 0.48)) 75.2 [61.98916, 89.31625]95
## 7 arima_auto 2017 Jul t(N(14, 0.53)) 75.8 [62.00954, 90.76842]95
## 8 arima_auto 2017 Aug t(N(14, 0.58)) 80.4 [65.57069, 96.46289]95
## 9 arima_auto 2017 Sep t(N(15, 0.62)) 87.1 [71.04251, 104.44180]95
## 10 arima_auto 2017 Oct t(N(16, 0.66)) 96.5 [78.98670, 115.39304]95
## # ℹ 14 more rows
Figure 4.6 shows that most residual autocorrelations lie within the blue bounds, indicating white noise.
augment(arima_fit) |>
filter(.model == "arima_auto") |>
ACF(.innov) |>
autoplot() +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 4.6: ACF for best fit ARIMA model
augment(arima_fit) |>
filter(.model == "arima_auto") |>
features(.innov, ljung_box, lag = 36, dof = 24)
## # A tibble: 1 × 5
## State Industry .model lb_stat lb_pvalue
## <chr> <chr> <chr> <dbl> <dbl>
## 1 South Australia Hardware, building and garden suppli… arima… 36.7 0.000248
Here, the p-value is < 0.05, rejecting the null hypothesis of independently distributed residuals.
Both models show an approximately normal residual distribution and near-white-noise behaviour (figures 4.1 and 4.4).
The forecasts are similar for both models, but when augmented, the ETS model has more negative values in .resid and .innov.
The prediction intervals of the ARIMA model are narrower than those of the ETS model.
Both models return a Ljung-Box p-value < 0.05.
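The interval comparison can be quantified by averaging the 95% prediction-interval widths for each best model (a sketch reusing the `ets_fit` and `arima_fit` mables fitted earlier, with the fpp3 packages loaded):

```r
# Sketch: mean width of the 95% prediction intervals per model.
pi_width <- function(fbl) {
  fbl |>
    hilo(level = 95) |>
    as_tibble() |>
    mutate(width = `95%`$upper - `95%`$lower) |>
    group_by(.model) |>
    summarise(mean_width = mean(width))
}

bind_rows(
  ets_fit   |> select(ets_auto)   |> forecast(h = 24) |> pi_width(),
  arima_fit |> select(arima_auto) |> forecast(h = 24) |> pi_width()
)
```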
both_best <- train |>
model(ets_auto = ETS(box_cox(Turnover, lambda)),
arima_auto = ARIMA(box_cox(Turnover, lambda)))
both_best |>
forecast(h = 24) |>
autoplot(test) +
ylab("Turnover ($Million AUD)") +
xlab("Year Month") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 5.1: Best model forecast for test set
Comparing figure 5.1 with figure 4.2 (the forecast over the full series), and comparing the two best models with each other, shows that the prediction intervals are narrower for the best ARIMA model.
both_best |>
forecast(h = 24) |>
accuracy(myseries)
## # A tibble: 2 × 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 arima_au… Sout… Hardwar… Test 9.93 11.2 9.93 10.1 10.1 1.65 1.31 0.406
## 2 ets_auto Sout… Hardwar… Test 12.8 14.5 12.8 12.8 12.8 2.12 1.70 0.600
The error measures (RMSE, MAE, MASE) are lower for the best ARIMA model, suggesting more accurate forecasts on the test set.
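MASE puts the test MAE on an interpretable scale: it divides by the in-sample MAE of a naive forecast (the seasonal naive for seasonal data in fabletools), so the ARIMA value of 1.65 means its test errors are 65% larger than one-step naive errors on the training data. A toy base-R illustration with hypothetical numbers:

```r
# MASE = mean(|test errors|) / mean(|one-step naive errors on the training set|)
train_y <- c(10, 12, 11, 13, 15, 14, 16, 18)   # toy training series
fc_err  <- c(1.2, -0.8, 1.5)                   # hypothetical test-set errors
naive_scale <- mean(abs(diff(train_y, lag = 1)))
mase <- mean(abs(fc_err)) / naive_scale
mase                                           # ~0.68: better than naive
```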
both_best2 <- myseries |>
model(ets_ANA = ETS(box_cox(Turnover, lambda)),
arima_102011 = ARIMA(box_cox(Turnover, lambda)))
both_best2 |>
tidy()
## # A tibble: 20 × 8
## State Industry .model term estimate std.error statistic p.value
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 South Australia Hardwar… ets_A… alpha 0.659 NA NA NA
## 2 South Australia Hardwar… ets_A… gamma 0.159 NA NA NA
## 3 South Australia Hardwar… ets_A… l[0] 4.28 NA NA NA
## 4 South Australia Hardwar… ets_A… s[0] -0.406 NA NA NA
## 5 South Australia Hardwar… ets_A… s[-1] -0.499 NA NA NA
## 6 South Australia Hardwar… ets_A… s[-2] -0.314 NA NA NA
## 7 South Australia Hardwar… ets_A… s[-3] 2.53 NA NA NA
## 8 South Australia Hardwar… ets_A… s[-4] 0.321 NA NA NA
## 9 South Australia Hardwar… ets_A… s[-5] 0.0878 NA NA NA
## 10 South Australia Hardwar… ets_A… s[-6] 0.0627 NA NA NA
## 11 South Australia Hardwar… ets_A… s[-7] -0.0421 NA NA NA
## 12 South Australia Hardwar… ets_A… s[-8] -0.317 NA NA NA
## 13 South Australia Hardwar… ets_A… s[-9] -0.604 NA NA NA
## 14 South Australia Hardwar… ets_A… s[-1… -0.256 NA NA NA
## 15 South Australia Hardwar… ets_A… s[-1… -0.559 NA NA NA
## 16 South Australia Hardwar… arima… ar1 0.972 0.0124 78.6 6.60e-257
## 17 South Australia Hardwar… arima… ma1 -0.254 0.0496 -5.13 4.39e- 7
## 18 South Australia Hardwar… arima… ma2 -0.143 0.0461 -3.09 2.14e- 3
## 19 South Australia Hardwar… arima… sma1 -0.767 0.0326 -23.5 2.81e- 79
## 20 South Australia Hardwar… arima… cons… 0.00889 0.00289 3.07 2.27e- 3
gg_tsresiduals(both_best2 |>
select(ets_ANA)) +
ggtitle("ETS(A,N,A) residuals on full data set")
gg_tsresiduals(both_best2 |>
select(arima_102011)) +
ggtitle("ARIMA(1,0,2)(0,1,1) residuals on full data set")
both_best2 |>
select(ets_ANA) |>
augment()
## # A tsibble: 441 x 6 [1M]
## # Key: .model [1]
## .model Month Turnover .fitted .resid .innov
## <chr> <mth> <dbl> <dbl> <dbl> <dbl>
## 1 ets_ANA 1982 Apr 8.6 8.64 -0.0399 -0.0126
## 2 ets_ANA 1982 May 9.5 9.60 -0.0961 -0.0288
## 3 ets_ANA 1982 Jun 8.4 8.41 -0.0113 -0.00364
## 4 ets_ANA 1982 Jul 10.3 9.32 0.977 0.289
## 5 ets_ANA 1982 Aug 10.6 10.9 -0.323 -0.0909
## 6 ets_ANA 1982 Sep 11.5 11.1 0.416 0.114
## 7 ets_ANA 1982 Oct 10.8 11.4 -0.650 -0.180
## 8 ets_ANA 1982 Nov 13.1 11.9 1.22 0.318
## 9 ets_ANA 1982 Dec 25.4 22.8 2.61 0.478
## 10 ets_ANA 1983 Jan 9.2 11.5 -2.26 -0.652
## # ℹ 431 more rows
both_best2 |>
select(arima_102011) |>
augment()
## # A tsibble: 441 x 6 [1M]
## # Key: .model [1]
## .model Month Turnover .fitted .resid .innov
## <chr> <mth> <dbl> <dbl> <dbl> <dbl>
## 1 arima_102011 1982 Apr 8.6 8.59 0.0107 0.00338
## 2 arima_102011 1982 May 9.5 9.49 0.0122 0.00366
## 3 arima_102011 1982 Jun 8.4 8.39 0.0103 0.00332
## 4 arima_102011 1982 Jul 10.3 10.3 0.0135 0.00390
## 5 arima_102011 1982 Aug 10.6 10.6 0.0140 0.00398
## 6 arima_102011 1982 Sep 11.5 11.5 0.0156 0.00423
## 7 arima_102011 1982 Oct 10.8 10.8 0.0144 0.00404
## 8 arima_102011 1982 Nov 13.1 13.1 0.0184 0.00465
## 9 arima_102011 1982 Dec 25.4 25.4 0.0406 0.00723
## 10 arima_102011 1983 Jan 9.2 9.19 0.0117 0.00357
## # ℹ 431 more rows
Figure 6.1 shows the forecasts of the two best models on the full data set with 80% prediction intervals.
both_best2 |>
forecast(h = 24) |>
autoplot(myseries, level = 80) +
ylab("Turnover ($Million AUD)") +
xlab("Year Month") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 6.1: 80% interval forecast by best models
# Download ABS Retail Trade data (catalogue 8501.0, table 11)
abs_retail_data <- read_abs("8501.0", tables = "11")
# Keep the South Australian hardware series for months after myseries ends
abs <- abs_retail_data |>
separate(series,
into = c("extra", "State", "Industry"),
sep = ";") |>
mutate(
Month = yearmonth(date),
State = trimws(State),
Industry = trimws(Industry)
) |>
select(Month, State, value, Industry) |>
filter(State == "South Australia") |>
filter(Industry == "Hardware, building and garden supplies retailing") |>
rename(Turnover = value) |>
filter(Month > yearmonth("2018 Dec"))
Comparing figures 7.1 and 7.2, the ETS forecasts lag behind the actual values, whereas the ARIMA forecasts track them more closely.
abs |>
ggplot(aes(x = Month, y = Turnover)) +
geom_line() +
ylab("Turnover ($Million AUD)") +
xlab("Year Month") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 7.1: ABS data plot
both_best2 |>
forecast(h = 51) |>
autoplot() +
ylab("Turnover ($Million AUD)") +
xlab("Year Month") +
theme_minimal() +
theme(
plot.background = element_rect(fill = "#FFF8F7"),
panel.grid.major.y = element_blank(),
axis.text.x = element_text(
color = "#993333",
angle = 45,
size = 8
),
axis.text.y = element_text(colour = "#08007F", size = 8)
)
Figure 7.2: Forecast from myseries data
The ABS data contain 51 more months than the myseries data, so forecasts are produced for 51 months (h = 51) in figure 7.2.
The chosen ETS model was ETS(A,N,A) and the chosen ARIMA model was ARIMA(1,0,2)(0,1,1).
ETS models are always non-stationary, whereas ARIMA models describe series that are stationary after any differencing.
The ARIMA model fits the training data slightly better than the ETS model.
ETS modelling gives more weight to recent observations, with the weights decaying exponentially for older ones.
The automatic ETS selection was somewhat surprising: it chose ETS(A,N,A), which has no trend component, and therefore handled the trend poorly.
As a result, the ETS forecasts lagged behind the actual values compared with the ARIMA forecasts.
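This lagging behaviour is inherent to exponential smoothing: with the estimated alpha of 0.659 from the tidy() output above, the level is a weighted average of past observations with weights alpha(1 − alpha)^j, so the fitted level always trails a trending series. A base-R sketch of the weight decay:

```r
# Exponential smoothing weights alpha * (1 - alpha)^j for lags j = 0, 1, 2, ...
alpha   <- 0.659                       # alpha estimate from the tidy() output above
weights <- alpha * (1 - alpha)^(0:19)
round(weights[1:4], 3)                 # 0.659 0.225 0.077 0.026
sum(weights)                           # ~1: observations a few lags back barely count
```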
Outliers can distort ARIMA estimates, so the data were checked for duplicates before modelling.
Below, the accuracy of both models is compared on the test set; the ARIMA model has lower error values and is therefore slightly more accurate. (The NaN values for MASE and RMSSE appear because accuracy() is applied to the test set alone here, so the training-based scaling denominator is unavailable.)
bind_rows(
arima_fit |> select(arima_auto) |> accuracy(),
ets_fit |> select(ets_auto) |> accuracy(),
arima_fit |> select(arima_auto) |> forecast(h = 10) |> accuracy(test),
ets_fit |> select(ets_auto) |> forecast(h = 10) |> accuracy(test)
)
## # A tibble: 4 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 arima_auto Training -0.0319 3.58 2.52 -0.230 5.58 0.419 0.421 0.0618
## 2 ets_auto Training 0.257 3.62 2.59 0.437 5.75 0.430 0.425 0.0887
## 3 arima_auto Test 7.96 9.32 7.96 8.64 8.64 NaN NaN 0.135
## 4 ets_auto Test 8.35 10.2 8.35 8.96 8.96 NaN NaN 0.176
Australian retail trade turnover. rdrr.io. https://rdrr.io/cran/tsibbledata/man/aus_retail.html
Chang, W. (2023, May 16). 13.13 Creating a QQ Plot | R Graphics Cookbook, 2nd edition. https://r-graphics.org/recipe-miscgraph-qq
Coder, R. (2021). Histogram with density in ggplot2. R CHARTS | a Collection of Charts and Graphs Made With the R Programming Language. https://r-charts.com/distribution/histogram-density-ggplot2/
Forecasting: Principles and Practice (3rd ed). (n.d.). https://otexts.com/fpp3/
Holtz, Y. (n.d.). The R Graph Gallery – Help and inspiration for R charts. The R Graph Gallery. https://r-graph-gallery.com/
Retail Trade, Australia, March 2023. (2023, May 9). Australian Bureau of Statistics. https://www.abs.gov.au/statistics/industry/retail-and-wholesale-trade/retail-trade-australia/latest-release
Robjhyndman. (n.d.). fpp3package/README.Rmd at master · robjhyndman/fpp3package. GitHub. https://github.com/robjhyndman/fpp3package/blob/master/README.Rmd
Roy, A. (2023). What is Sales Forecasting? CX Today. https://www.cxtoday.com/contact-centre/what-is-sales-forecasting/